Evaluation of an Observation-based Climatology Model for Predicting
Visibility for Data-void Locations in Germany


by 

S.  J.  Bean  and  P.  N.  Somerville 
University  of  Central  Florida 


1.0  INTRODUCTION 

1.1 Review of Problem

A  goal  of  the  Air  Weather  Service  has  been  to  achieve  a  capability  to 
determine  the  climatic  probability  of  above-threshold  conditions  of  the  weather 
relative  to  the  success  of  an  Air  Force  flight  mission,  anywhere,  at  any  time, 
expeditiously.  Such  a  capability  would  materially  heighten  the  effectiveness  of 
a  weapon  system,  since  it  is  well-known  that  the  environment  can  both  degrade  and 
enhance  system  effectiveness. 

One  method  of  summarizing  or  compacting  the  huge  volume  of  historical 
records  is  by  means  of  empirical  cumulative  distribution  functions  (cdf’s).  An 
empirical  cumulative  distribution  function  is  simply  the  tabulated  cumulative 
relative  frequencies,  or  probabilities  that  a  given  variable  will  fall  below 
specified  values.  Somerville  and  Bean  (1979)  have  demonstrated  that  a  number  of 
climatological  variables  may  be  modeled  with  closed  form  distribution  functions. 
For  a  given  location  and  time  (e.g.,  month  and  hour)  the  historical  observations 
can  be  used  to  estimate  specific  model  parameters. 

Somerville  and  Bean  (1981)  used  the  Weibull  distribution  to  model 
visibility  in  Germany  for  30  stations.  The  cumulative  distribution  function  for 
the  Weibull  is  given  by 

$$F(x) = 1 - e^{-\alpha x^{\beta}}$$

where α and β are constants. Values of α and β are derived for each station,
for each month and each of the eight 3-hour periods of the day. The probability
of visibility less than x miles is then obtained by substitution in F(x).
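
The substitution step can be illustrated with a short Python sketch. The function name weibull_cdf and the parameter values below are hypothetical, chosen only for illustration; they are not taken from the tabulated results.

```python
import math

def weibull_cdf(x, alpha, beta):
    """Probability that visibility is less than x miles,
    using F(x) = 1 - exp(-alpha * x**beta)."""
    if x <= 0:
        return 0.0
    return 1.0 - math.exp(-alpha * x ** beta)

# Hypothetical parameters for one station, month, and 3-hour period.
alpha, beta = 0.15, 1.2

for miles in (0.5, 1.0, 3.0, 6.0):
    p = weibull_cdf(miles, alpha, beta)
    print(f"P(visibility < {miles} mi) = {p:.3f}")
```

In practice a separate (α, β) pair would be looked up for each station, month, and hour period before calling the function.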

1.2 Methods of Extending Visibility Probabilities to Data-Void Regions

A more difficult problem is to develop models which can be used to
estimate probabilities for locations where records presently do not exist.
Somerville and Bean (1981) developed two models for Germany. Thirty stations for
which visibility records were available were used in the development of the
models. For each of the stations, for a specified month and hour period, the
Weibull distribution was used to model visibility. That is, a value for each of
α and β was obtained. Having obtained the values for α and β, these values were
regressed on a set of variables which were thought to have a possible influence
on visibility. These included elevation, elevation relative to the average
elevation on a circle whose radius is 30 kilometers from the station, east-west
and north-south elevation difference, population density, relative humidity,
proximity to a major body of water, mean wind speed, mean precipitation,
latitude, longitude, and functions of and interactions between the above. A
stepwise regression program was used to select which of the variables could be
used as predictors for a specified month and hour period. Finally, a specially
designed least squares non-linear regression program was used to simultaneously
determine the regression coefficients in the formulas for α and β. This model
was named the "variables model." The coefficients and the regression models are
given in the above referenced paper.

A second model, named the "constants model," was also developed. Here
only month and time of day were considered. That is, no information regarding
elevation, humidity, wind speed, etc., was used. For each month and time of
day, non-linear regression was used to determine the "best" values for α and β.
These were also tabulated in Somerville and Bean (1981).
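
The idea behind the constants model can be sketched as follows: pool the empirical cumulative frequencies from all calibration stations for one month and hour period, then fit a single (alpha, beta) pair. The details of the non-linear regression program are not given here, so this sketch substitutes a different and simpler technique, ordinary least squares on the linearized form ln(-ln(1 - F)) = ln(alpha) + beta ln(x). The function name fit_constants and the sample data are hypothetical.

```python
import math

def fit_constants(points):
    """Fit one (alpha, beta) to pooled (x, F) pairs.

    Uses least squares on y = ln(-ln(1 - F)) against t = ln(x),
    where y = ln(alpha) + beta * t. This is a stand-in for the
    non-linear regression described in the text, not a copy of it.
    """
    ts, ys = [], []
    for x, f in points:
        if x > 0 and 0.0 < f < 1.0:  # transform is undefined at F = 0 or 1
            ts.append(math.log(x))
            ys.append(math.log(-math.log(1.0 - f)))
    n = len(ts)
    t_bar = sum(ts) / n
    y_bar = sum(ys) / n
    sxx = sum((t - t_bar) ** 2 for t in ts)
    sxy = sum((t - t_bar) * (y - y_bar) for t, y in zip(ts, ys))
    beta = sxy / sxx
    alpha = math.exp(y_bar - beta * t_bar)
    return alpha, beta

# Hypothetical pooled data: (visibility threshold in miles, relative frequency)
pooled = [
    (0.5, 0.06), (1.0, 0.14), (3.0, 0.43), (6.0, 0.72),  # station A
    (0.5, 0.08), (1.0, 0.16), (3.0, 0.40), (6.0, 0.70),  # station B
]
alpha, beta = fit_constants(pooled)
print(f"alpha = {alpha:.4f}, beta = {beta:.4f}")
```

The linearized fit minimizes error on the transformed scale rather than on the probabilities themselves, so its estimates will generally differ somewhat from a direct non-linear least squares fit.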

In either model, extension of the climatology was accomplished by
obtaining values for α and β at the data-void station, and then using the
Weibull distribution to determine the desired visibility probabilities.

In  this  manuscript  we  will  restrict  ourselves  to  the  evaluation  of  the 
constants  model.  Evaluation  of  the  variables  model  will  be  given  in  a  future  report. 

2.0 EVALUATION OF THE CONSTANTS MODEL

Two methods were used to evaluate the model. First, 30 stations were
used as a "calibration" set, and the resulting constants model was used on a
second independent "evaluation" data set of 30 West German stations. Second,
sample re-use was used to evaluate the constants model. From the 60 evaluations
one "overall" RMS (Root Mean Square) of the method was obtained. Sample re-use
(sometimes called cross-validation) is a relatively new technique which makes it
possible to use the same set of data for "calibration" and "evaluation".
Briefly, if there is a total of n stations, n separate solutions are obtained.
For each solution, one station is used as the evaluation set with the remaining
n-1 used as the calibration set. Using the method one can obtain the Root Mean
Square (RMS) error of the modeling procedure for each station. The individual
RMS errors may then be combined to obtain an overall RMS error for the
procedure. For future use the recommended model is the one using all the
stations for the calibration set. For a good account of the sample re-use
technique, papers by Stone (1974) and Geisser (1975) are recommended.
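
The sample re-use procedure can be sketched in Python as a leave-one-station-out loop. This is a minimal illustration under stated assumptions: the fit uses a linearized least-squares stand-in rather than the non-linear regression of the study, the overall RMS is formed by pooling the squared errors from all held-out stations (one possible way of combining them), and all function names and data values are hypothetical.

```python
import math

def weibull_cdf(x, alpha, beta):
    """F(x) = 1 - exp(-alpha * x**beta) for x > 0, else 0."""
    return 1.0 - math.exp(-alpha * x ** beta) if x > 0 else 0.0

def fit_constants(points):
    """Least squares on ln(-ln(1 - F)) = ln(alpha) + beta * ln(x).
    Stand-in for the non-linear regression used in the study."""
    pairs = [(math.log(x), math.log(-math.log(1.0 - f)))
             for x, f in points if x > 0 and 0.0 < f < 1.0]
    n = len(pairs)
    t_bar = sum(t for t, _ in pairs) / n
    y_bar = sum(y for _, y in pairs) / n
    sxx = sum((t - t_bar) ** 2 for t, _ in pairs)
    sxy = sum((t - t_bar) * (y - y_bar) for t, y in pairs)
    beta = sxy / sxx
    return math.exp(y_bar - beta * t_bar), beta

def rms(errors):
    """Root mean square of a list of errors."""
    return math.sqrt(sum(e * e for e in errors) / len(errors))

def sample_reuse(stations):
    """stations maps a station name to a list of (x, F) pairs.
    Each station is held out once; the model is fit on the rest."""
    per_station = {}
    all_errors = []
    for held_out, observed in stations.items():
        calibration = [p for name, pts in stations.items()
                       if name != held_out for p in pts]
        alpha, beta = fit_constants(calibration)
        errors = [weibull_cdf(x, alpha, beta) - f for x, f in observed]
        per_station[held_out] = rms(errors)
        all_errors.extend(errors)
    return per_station, rms(all_errors)

# Hypothetical empirical cdf values for three stations.
stations = {
    "A": [(0.5, 0.06), (1.0, 0.14), (3.0, 0.43), (6.0, 0.72)],
    "B": [(0.5, 0.08), (1.0, 0.16), (3.0, 0.40), (6.0, 0.70)],
    "C": [(0.5, 0.05), (1.0, 0.12), (3.0, 0.38), (6.0, 0.66)],
}
per_station, overall = sample_reuse(stations)
for name, value in per_station.items():
    print(f"station {name}: RMS = {value:.3f}")
print(f"overall RMS = {overall:.3f}")
```

With n stations the loop performs n fits, each on n-1 stations, which matches the description above.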

The first evaluation method gave some encouraging results, and it
also indicated some ways we may improve the model. Exhibit 2.1 gives the RMS
of the probability estimates for each station taken over all months and hour
periods with the exception of hour periods 1 and 2. The data for these early
pre-dawn hours are frequently missing, and they were, therefore, eliminated from
the study. Stations 4, 13, and 26 stand out as very poor fits which inflate the
overall RMS considerably. These three stations are much higher in elevation than
the other stations in the study. Also, they are much higher than the surrounding
area. These factors seem to give rise to much different visibility conditions
than the other areas in the study.

Exhibit 2.2 gives the resulting RMS values averaged over all stations
in the evaluation set for each month and hour periods 3 through 8. The overall
RMS of .108 compares favorably with the calibration data set overall RMS of
.063. The model obviously does not fit as well on the evaluation set, and we
certainly could not expect that it would.

Exhibit 2.1

Overall RMS For Each of The 30 Stations
in The Evaluation Set

Exhibit 2.2

RMS  Over  All  Stations  in  the  Evaluation  Set 
By  Month  and  Hour  Period  (LST) 

The results for the sample re-use evaluation were comparable to those
for the first study. The RMS values for each of the 60 stations averaged over
all months and hour periods 3 through 8 are given in Exhibit 2.3. The RMS values
over all 60 stations for all months and hour periods 3 through 8 are given in
Exhibit 2.4. The RMS values corresponding to sample re-use (Exhibit 2.4) are
generally smaller than the results using the second 30 stations for evaluation
(Exhibit 2.2). This is mainly due to the larger sample size in the sample re-use
evaluation. The sample re-use evaluation makes use of 59 stations to build a
model, whereas the first procedure makes use of only 30 stations in the model
estimation.

Exhibit 2.5 gives the values of α and β for the constants model for
each month and hour period where all 60 stations are used for the calibration
set.

Exhibit 2.6 shows the locations in Germany which were used in developing
the models evaluated in this report.

Exhibit 2.6 West German Stations


3.0 CONCLUSIONS AND RECOMMENDATIONS

The constants model has been shown to give good results for estimating
visibility probabilities with the exception of some higher elevation areas.
Because the variables model makes use of a number of characteristics of the
data-void location, including that of elevation, it is expected to improve the
fit over the constants model.

One problem with the variables model, however, is that many of the input
variables may be as hard to obtain as information about visibility. Two other
models should be investigated to see if it is possible to improve on the
constants model and at the same time not require too much information: one is a
model that uses only topographical variables such as elevation and average
elevation of the surrounding area; another is a model based on cluster analysis.
That is, the known (α, β) parameters might be used to determine regions of
homogeneous visibility characteristics. Constants models could then be used on
the individual regions and, as before, sample re-use could be used to evaluate
the results.